Navigating ultrasmall worlds in ultrashort time 

Random scale-free networks are ultra-small worlds. The average length of shortest paths in
networks of size $N$ scales as $\ln\ln N$. Here we show that these ultra-small worlds can be navigated in
ultra-short time. Greedy routing on scale-free networks embedded in metric spaces finds paths with
the average length also scaling as $\ln\ln N$. Greedy routing uses only local information to navigate a
network. Nevertheless, it finds asymptotically shortest paths, direct computation of which requires
global topology knowledge. Our findings imply that the peculiar structure of complex networks
ensures that the lack of global topological awareness has asymptotically no impact on the length of
communication paths. These results have important consequences for communication systems such
as the Internet, where maintaining knowledge of current topology is a major scalability bottleneck.

Random scale-free networks are ultra-small worlds [1, 2]. The average and maximum lengths
of shortest paths in scale-free networks with power-law degree distributions,
$P(k) \sim k^{-\gamma}$, $\gamma \in [2, 3]$, scale with network size $N$ as $\ln\ln N$ [1, 2] [11].
However, finding such shortest paths requires global topology knowledge, which is not
available to nodes in many real networks. It may seem surprising at first that, having no
global topological awareness, nodes can find any paths to destinations at all. In [3] we
address this apparent paradox by showing that the observed topological characteristics of
complex networks maximize their navigability, measured by the efficiency of the greedy
routing process.

Greedy routing (GR) [...] relies on the hidden metric space abstraction [10]. In this
abstraction a network is embedded in a metric space, with distances in this space
representing intrinsic node similarities. To route information to a given destination, a
node forwards the information to its network neighbor closest to the destination in this
space. This general mechanism underlies processes ranging from search in social networks
to protein folding [12]. The existence of hidden metric spaces under real networks in
general is a conjecture, but we found empirical evidence of their existence for some real
networks, such as the Internet or some social networks. In other cases, the metric space
may be visible. In the airport network, for example, this space is geographic [...].
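
The forwarding rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the cited works. The function name `greedy_route`, the toy coordinates, and the failure rule (give up when no neighbor is strictly closer to the destination than the current node) are assumptions introduced here.

```python
import math

def greedy_route(adjacency, position, source, target, distance=math.dist):
    """Return the list of nodes visited by greedy routing, or None if stuck.

    adjacency: dict mapping each node to a list of its neighbors
    position:  dict mapping each node to its coordinates in the metric space
    distance:  metric on coordinates (Euclidean by default)
    """
    path = [source]
    current = source
    while current != target:
        neighbors = adjacency[current]
        if not neighbors:
            return None
        # Only local information is used: the neighbors' distances to the target.
        best = min(neighbors, key=lambda n: distance(position[n], position[target]))
        here = distance(position[current], position[target])
        if distance(position[best], position[target]) >= here:
            return None  # assumed failure rule: no neighbor is closer
        path.append(best)
        current = best
    return path

# Toy network in a plane: node 0 has one short link (to 1) and one long link (to 3).
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 1.0), 3: (3.0, 0.0), 4: (5.0, 0.0)}
links = {0: [1, 3], 1: [0, 2], 2: [1, 4], 3: [0, 4], 4: [2, 3]}
print(greedy_route(links, coords, 0, 4))  # [0, 3, 4]
```

No node consults anything beyond its own neighbor list and the coordinates of the destination, which is the sense in which greedy routing is local.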

In [...], numerical experiments show that scale-free networks are navigable in a wide
region of parameters. Specifically, GR and its modifications are found to perform
generally well, in terms of the length and number of successful paths, on scale-free
networks embedded in a plane. The GR efficiency is attributed to network heterogeneity.
In [...], analytic results and simulations show that not only heterogeneity but also
clustering strongly affects the GR efficiency. The thermodynamic limit is considered, and
a network is called navigable if, in this limit, GR can find paths for a macroscopic
fraction of source-destination pairs. Navigable networks are shown to have sufficiently
strong clustering and heterogeneity of node degrees, i.e., $\gamma \approx 2$.

Here we show analytically and in simulations that the average hop length of paths that GR
produces in these navigable networks scales with network size as
$\bar{\tau} \sim \ln\ln N / |\ln(\gamma - 2)|$. Given that the average length of shortest
paths in these networks, as shown in [1, 2], also scales as
$\ln\ln N / |\ln(\gamma - 2)|$, we conclude that the GR paths are asymptotically shortest.

To obtain this result we use the generic class of models introduced in [10]. These models
generate scale-free networks embedded in metric spaces as follows. Given a target network
size $N$, first assign to all nodes their coordinates in the metric space, and an
additional hidden variable $\kappa$ representing their expected degrees. To generate
scale-free networks, the variable $\kappa$ is power-law distributed according to
$\rho(\kappa) \propto \kappa^{-\gamma}$, $\kappa \in [\kappa_0, \infty)$, where $\kappa_0$
is the minimum expected degree. The metric space can be any homogeneous and isotropic
$D$-dimensional space. Nodes are distributed in it with a uniform density $\delta$ that is
set to $\delta = 1$ without loss of generality. Then each pair of vertices $i$ and $j$ is
connected by an edge with probability $r(x)$, $x = d_{ij}/(\mu\kappa_i\kappa_j)^{1/D}$,
where $d_{ij}$ is the distance between the two vertices in the metric space, and
$\kappa_i$ and $\kappa_j$ are their expected degrees.

FIG. 1: Illustration of the efficient greedy routing mechanism. The figure shows the
current vertex $v$ with its local neighborhood. The size of each vertex is proportional to
its degree and the plane represents the underlying metric space. Vertex $v'$ is the
neighbor of $v$ which is closest to the target and also one of its furthest and
highest-degree neighbors. At the next hop, greedy routing proceeds from $v'$ to $v''$,
reaching an even higher-degree vertex, traveling an even longer distance, and getting much
closer to the target.

A proper choice of the parameter $\mu$, which depends on a specific form of the connection
probability $r(x)$, guarantees that the average degree of vertices with hidden variable
$\kappa$ is $\bar{k}(\kappa) = \kappa$, so that $\kappa$ can indeed be identified with the
degree. The exponent $\gamma$ in $\rho(\kappa)$ is then the power-law exponent of the
degree distribution in the resulting networks. These properties of the model hold for any
dimension $D$ of the metric space and for any form of the connection probability $r(x)$,
as long as the integral $\int x^{D-1} r(x)\,dx$ is bounded. We thus have a very versatile
class of models since we can independently fix the average degree and the exponent
$\gamma$ without specifying the function $r(x)$, which can then be used to control
clustering in the network. For example, in [10] we use $r(x) = (1 + x)^{-\alpha}$, with
$\alpha > D$ and
$\mu = \Gamma(D/2)\Gamma(\alpha)/[2\pi^{D/2}\langle k\rangle\Gamma(\alpha - D)\Gamma(D)]$.
This form of $r(x)$ leads to the following two extremes. In the limit $\alpha \to D$
clustering vanishes. The network loses its metric properties and becomes equivalent to a
random graph, where the probability that two nodes are connected depends only on their
expected degrees, and not on the metric distance between them. In the opposite extreme
$\alpha \to \infty$ clustering converges to a finite value, and the topology of the
network is strongly influenced by the metric properties of the underlying space. This
latter extreme yields networks lying in the navigable region.
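
The value of $\mu$ quoted for $r(x) = (1+x)^{-\alpha}$ can be checked with a short calculation. The sketch below assumes an unbounded space, so that the expected degree of a node is the connection probability integrated over all distances and over the hidden variables of the other nodes. The symbols $\ell$ (distance as an integration variable) and $S_{D-1}$ (surface area of the unit sphere in $D$ dimensions) are notation introduced here.

```latex
\begin{align*}
\bar{k}(\kappa)
  &= \delta \int_{\kappa_0}^{\infty} \mathrm{d}\kappa'\, \rho(\kappa')
     \int_0^{\infty} \mathrm{d}\ell\; S_{D-1}\, \ell^{D-1}\,
     r\!\left[\frac{\ell}{(\mu\kappa\kappa')^{1/D}}\right],
  \qquad S_{D-1} = \frac{2\pi^{D/2}}{\Gamma(D/2)}, \\
  &= \delta\, S_{D-1}\, \mu\, \kappa\, \langle\kappa\rangle
     \int_0^{\infty} x^{D-1} r(x)\, \mathrm{d}x
  \qquad \bigl(\ell = x\,(\mu\kappa\kappa')^{1/D}\bigr).
\end{align*}
% Requiring \bar{k}(\kappa) = \kappa, with \delta = 1 and \langle\kappa\rangle = \langle k\rangle:
\begin{equation*}
\mu = \frac{\Gamma(D/2)}
           {2\pi^{D/2}\,\langle k\rangle \int_0^{\infty} x^{D-1} r(x)\,\mathrm{d}x},
\qquad
\int_0^{\infty} \frac{x^{D-1}}{(1+x)^{\alpha}}\,\mathrm{d}x
  = \frac{\Gamma(D)\,\Gamma(\alpha - D)}{\Gamma(\alpha)} \quad (\alpha > D).
\end{equation*}
```

Substituting the second identity into the first reproduces the expression for $\mu$ given in the text, and shows why the integral $\int x^{D-1} r(x)\,dx$ must be bounded for the construction to work.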

We next give an intuitive explanation, illustrated in Fig. 1, for why GR is efficient in
these navigable networks with strong coupling between network topology and underlying
geometry. Suppose the GR process starts at some low-degree node and intends to reach a
destination located far away in the metric space. Ideally, the process should proceed to
hubs, high-degree nodes that likely cover long distances by their numerous connections.
However, GR is degree-agnostic: it checks only underlying distances. Therefore, this ideal
scenario with propagation through the hubs can only be implemented if the node's neighbor
closest to the destination is also its highest-degree neighbor. This condition is more
likely to be satisfied the faster $r(x)$ decreases, because the faster $r(x)$ decreases,
the stronger the dependency between a node's degree and the characteristic scale of
distances that the node covers by its connections. This dependency is simple: the higher
the degree of a node, the larger its characteristic distance scale. (In the non-navigable
limit $\alpha \to D$, this dependency disappears.) Consequently, if the next node along a
path has a higher degree, then the node after the next one has an even higher degree, and
the metric distance between these nodes also increases. On the other hand, the faster
$r(x)$ decreases (e.g., the larger $\alpha$), the stronger the clustering. We thus see
that in the navigable case with strong clustering, GR first travels over a sequence of
nodes with increasing degrees and increasing inter-node distances. At some point, after
the current distance to the destination becomes comparable to the inter-node distance,
this pattern changes, and the process completes in a finite number of hops.

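We now put this intuition on quantitative grounds. We first compute the probability that a
node of expected degree $\kappa$ has a neighbor with expected degree $\kappa'$ at a
distance $d$ from it, $P(\kappa', d \mid \kappa)$. Using results from [...], it is easy to
show that this probability is

$$P(\kappa', d \mid \kappa) = \frac{2\pi^{D/2}}{\Gamma(D/2)}\,\frac{d^{D-1}\rho(\kappa')}{\kappa}\, r\!\left[\frac{d}{(\mu\kappa\kappa')^{1/D}}\right]. \qquad (1)$$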

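The marginal distribution of $\kappa'$, obtained by integrating Eq. (1) over $d$, is

$$P(\kappa' \mid \kappa) = \frac{\kappa'\rho(\kappa')}{\langle k\rangle} \qquad (2)$$

for any function $r(x)$ and any dimension $D$. We next compute the correlation between
variables $\kappa'$ and $d$ conditioned on $\kappa$. Using Bayes' rule and Eq. (2), we write

$$P(d \mid \kappa, \kappa') = \frac{P(\kappa', d \mid \kappa)}{P(\kappa' \mid \kappa)} = \frac{d^{D-1}}{\mu\kappa\kappa' \int x^{D-1} r(x)\,dx}\, r\!\left[\frac{d}{(\mu\kappa\kappa')^{1/D}}\right]. \qquad (3)$$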

The average metric distance between two connected vertices with expected degrees $\kappa$
and $\kappa'$ is then

$$\bar{d}(\kappa, \kappa') \propto (\mu\kappa\kappa')^{1/D} \int_0^{x_c} x^{D} r(x)\,dx, \qquad (4)$$

where $x_c = d_c(N)/(\mu\kappa\kappa')^{1/D}$ and $d_c(N)$ is the maximum distance between
nodes in the metric space, $d_c(N) \sim N^{1/D}$.

If $r(x) = (1 + x)^{-\alpha}$ with $\alpha > D + 1$, then the integral in Eq. (4) is
bounded and we observe positive correlations between degrees and distances: the higher the
node degree, the longer the characteristic distances that it covers by its connections,
which is exactly the property guaranteeing GR efficiency. If $D < \alpha < 1 + D$, the
integral in Eq. (4) diverges and we obtain

$$\bar{d}(\kappa, \kappa') \sim (\mu\kappa\kappa')^{\frac{\alpha - D}{D}}\,[d_c(N)]^{D + 1 - \alpha}. \qquad (5)$$

In the limit $\alpha \to D$, $\bar{d}(\kappa, \kappa')$ loses any dependence on $\kappa$
and $\kappa'$, and becomes a large value diverging in the thermodynamic limit. As a
consequence, degrees and distances are no longer correlated. The furthest neighbor no
longer tends to have the highest degree. These arguments explain why the network cannot be
navigated if it loses its metric properties and clustering vanishes.

FIG. 2: Left: average length of GR paths in generated networks with different values of
$\gamma$ as a function of the system size $N$. Solid lines are fits of the form
$A_1 + A_2 \ln[\ln N + A_3]$. Right: parameter $A_2$ obtained from the fit of
$\bar{\tau}(N)$ compared to the theoretical prediction $A_2 = |\ln(\gamma - 2)|^{-1}$.

We now shift our attention entirely to navigable networks with $\alpha > D + 1$ and
compute the lengths of greedy paths in them. As the first step, we calculate the maximum
expected degree $\kappa_{c,nn}(\kappa)$ among all neighbors of a node of expected degree
$\kappa$. In a finite-size network, the variable $\kappa$ is bounded by a natural cut-off
$\kappa_c \sim N^{1/(\gamma - 1)}$ [...]. This cut-off is calculated as the value
$\kappa = \kappa_c$ such that we expect to find only one node with $\kappa > \kappa_c$ out
of a sample of $N$ vertices, $N \int_{\kappa_c}^{\infty} \rho(\kappa)\,d\kappa \sim 1$.
Following the same reasoning, the value of $\kappa_{c,nn}(\kappa)$ can be evaluated as



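$$\kappa \int_{\kappa_{c,nn}(\kappa)}^{\infty} P(\kappa' \mid \kappa)\,d\kappa' \sim 1, \qquad (6)$$

which leads to

$$\kappa_{c,nn}(\kappa) \sim \begin{cases} \kappa^{1/(\gamma - 2)}, & \kappa < \kappa_c^{\gamma - 2}, \\ \kappa_c, & \kappa > \kappa_c^{\gamma - 2}. \end{cases} \qquad (7)$$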
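
The step from Eq. (6) to Eq. (7) can be made explicit. The sketch below assumes the normalized power law $\rho(\kappa) = (\gamma - 1)\kappa_0^{\gamma - 1}\kappa^{-\gamma}$, for which $\langle k\rangle = (\gamma - 1)\kappa_0/(\gamma - 2)$, and a neighbor distribution $P(\kappa' \mid \kappa) = \kappa'\rho(\kappa')/\langle k\rangle$ that does not depend on $\kappa$. Replacing the scaling relation in Eq. (6) by an equality is a simplification made here only to fix the prefactor.

```latex
\begin{equation*}
\kappa \int_{\kappa_{c,nn}}^{\infty}
   \frac{\kappa'\rho(\kappa')}{\langle k\rangle}\,\mathrm{d}\kappa'
 = \kappa\,\frac{(\gamma-1)\,\kappa_0^{\gamma-1}}{(\gamma-2)\,\langle k\rangle}\,
   \kappa_{c,nn}^{\,2-\gamma}
 = \kappa \left(\frac{\kappa_0}{\kappa_{c,nn}}\right)^{\gamma-2} = 1
 \quad\Longrightarrow\quad
 \kappa_{c,nn}(\kappa) = \kappa_0\,\kappa^{1/(\gamma-2)} .
\end{equation*}
```

Under these assumptions the growth continues until $\kappa_{c,nn}$ reaches the global cut-off $\kappa_c \sim N^{1/(\gamma-1)}$, which happens at $\kappa \sim \kappa_c^{\gamma-2} \sim N^{(\gamma-2)/(\gamma-1)}$, the boundary between the two branches of Eq. (7).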



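This result, together with Eq. (4), yields the following expression for the average
distance to the next node along a GR path from a node of degree $\kappa$:

$$\bar{d}(\kappa) \sim \left[\mu\,\kappa^{\frac{\gamma - 1}{\gamma - 2}}\right]^{1/D} \quad \text{for} \quad \kappa < \kappa_c^{\gamma - 2}. \qquad (8)$$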



Eqs. (7) and (8) turn out to be central to our analysis. First, we see from Eq. (7) that
only if $1/(\gamma - 2) > 1$, i.e., if $\gamma < 3$, is the degree of the next node along
a GR path, on average, higher than the degree of the current node. This property explains
why only scale-free networks with $\gamma < 3$ are navigable.

Eq. (8) also shows that the expected distance between the next node and the current node
of degree $\kappa \sim N^{(\gamma - 2)/(\gamma - 1)}$ is $\sim N^{1/D}$, which is of the
order of the maximum distance between all nodes in the metric space. In other words, we
can cross the entire network in a single hop, landing at a node located at a finite and
size-independent distance from the target. Putting these observations together, we
conclude that the time to reach a target from a low-degree source located far away
($\sim N^{1/D}$) from the target is roughly the number of hops that it takes to reach a
node of expected degree $\kappa \sim N^{(\gamma - 2)/(\gamma - 1)}$, which is a
size-dependent contribution, plus the number of hops needed to cover a finite distance
from this node to the target, which is a size-independent contribution.

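Following these observations, we iterate Eq. (7),

$$\kappa_{\tau + 1} \sim \kappa_{\tau}^{1/(\gamma - 2)}, \qquad \tau = 0, 1, \ldots \qquad (9)$$

to find the value of $\tau$ such that $\kappa_{\tau} \sim N^{(\gamma - 2)/(\gamma - 1)}$.
The solution is

$$\bar{\tau} = A + \frac{\ln[\ln N + B]}{|\ln(\gamma - 2)|}, \qquad (10)$$

where $A$ and $B$ are functions of $\gamma$ and $\langle k\rangle$.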
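
The iteration behind this result is easy to reproduce numerically. The sketch below counts how many times the map $\kappa \to \kappa^{1/(\gamma-2)}$ must be applied before $\kappa$ reaches $N^{(\gamma-2)/(\gamma-1)}$, and prints the leading term $\ln\ln N/|\ln(\gamma-2)|$ next to it. The function names, the starting value `kappa_start = 2.0`, and the omission of all prefactors are assumptions made for illustration, so only the growth with $N$ is meaningful, not the additive constants.

```python
import math

def hops_to_crossing_hub(n_nodes, gamma, kappa_start=2.0):
    """Count iterations of kappa -> kappa**(1/(gamma-2)) needed to reach
    kappa >= n_nodes**((gamma-2)/(gamma-1)), starting from kappa_start.

    The loop tracks ln(kappa) to avoid overflow: each hop divides it by (gamma-2).
    """
    if not 2.0 < gamma < 3.0:
        raise ValueError("gamma must lie strictly between 2 and 3")
    if kappa_start <= 1.0:
        raise ValueError("kappa_start must exceed 1 so that kappa grows")
    log_kappa = math.log(kappa_start)
    log_target = (gamma - 2.0) / (gamma - 1.0) * math.log(n_nodes)
    hops = 0
    while log_kappa < log_target:
        log_kappa /= (gamma - 2.0)
        hops += 1
    return hops

def leading_term(n_nodes, gamma):
    """Leading-order prediction ln ln N / |ln(gamma - 2)|."""
    return math.log(math.log(n_nodes)) / abs(math.log(gamma - 2.0))

for exponent in (6, 12, 24):
    n = 10.0 ** exponent
    print(exponent, hops_to_crossing_hub(n, 2.5), round(leading_term(n, 2.5), 2))
```

For $\gamma = 2.5$ the map squares $\kappa$ at every hop, so squaring $N$ costs exactly one extra hop: the loop returns 3, 4, and 5 hops for $N = 10^6$, $10^{12}$, and $10^{24}$.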

This result is remarkable in many respects. First, in the large-size limit we obtain
$\bar{\tau} \sim \ln\ln N$, meaning that greedy paths are ultra-short. Second, the
prefactor in front of the logarithm is just a function of $\gamma$, surprisingly
independent of the average degree. Finally, this prefactor is equal to the prefactor of
the average shortest-path lengths in scale-free networks [1, 2]. It was also shown in
[1, 2] that fluctuations around the average shortest-path lengths are constant. Therefore,
in the thermodynamic limit the shortest-path length distribution becomes a delta function.
This fact, together with the equality between the average shortest- and greedy-path
lengths, implies that for $N \gg 1$ the distribution of greedy-path lengths also converges
to the same delta function. Consequently, in large networks, all greedy paths are shortest
paths.

To check the accuracy of our theory, we perform extensive numerical simulations for the
model with $D = 1$ (a circle) and $\alpha = \infty$, which is equivalent to taking
$r(x) = e^{-x}$ and $\mu = 1/(2\langle k\rangle)$. We also fix the minimum expected degree
to $\kappa_0 = 2$. We note that parameters $\kappa_0$ and $\delta$ are dummy and can be
set to arbitrary values; the only independent parameters in the model are the average
degree $\langle k\rangle$, the degree exponent $\gamma$, and clustering strength $\alpha$.
Fixing $\kappa_0 = 2$ helps to generate networks that are fully connected almost surely.
If $\kappa_0$ is fixed to a constant, then the average degree depends on $\gamma$ as
$\langle k\rangle = (\gamma - 1)\kappa_0/(\gamma - 2)$. Varying $\langle k\rangle$ is
desirable as it allows us to directly check with simulations if there is indeed no
dependency on $\langle k\rangle$ of the prefactor in Eq. (10). We also verified that
networks with a fixed average degree yield the same results.

Once a network with these parameters is generated, 
we simulate the GR process by choosing at random a 
source and destination, and forwarding at each node to 
the node's neighbor closest to the destination on the cir- 
cle. The number of source-destination pairs is 10^, and 
the results are averaged over a number of network realizations ranging between 400 and
4000. This process is performed for different power-law exponents $\gamma$ and network
sizes $N$. The average GR path length $\bar{\tau}$ is then computed as a function of $N$
for different $\gamma$.
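
A small-scale version of this experiment can be sketched as follows. The code generates one network on a circle with $r(x) = e^{-x}$, $\mu = 1/(2\langle k\rangle)$, and $\kappa_0 = 2$, then runs greedy routing between random source-destination pairs. It is an illustration under stated assumptions, not the simulation code behind the reported results. The function names, the quadratic-time pair loop, the tiny sizes, and the failure rule (a path is dropped when no neighbor is closer to the destination) are all choices made here.

```python
import math
import random

def circle_distance(a, b, circumference):
    """Shortest arc length between two positions on a circle."""
    diff = abs(a - b) % circumference
    return min(diff, circumference - diff)

def generate_network(n_nodes, gamma, kappa_0, rng):
    """Sample positions and links on a circle of circumference n_nodes (density 1)."""
    mean_degree = (gamma - 1.0) * kappa_0 / (gamma - 2.0)
    mu = 1.0 / (2.0 * mean_degree)
    position = [rng.random() * n_nodes for _ in range(n_nodes)]
    # Inverse-transform sampling of rho(kappa) ~ kappa**(-gamma), kappa >= kappa_0.
    kappa = [kappa_0 * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
             for _ in range(n_nodes)]
    adjacency = [[] for _ in range(n_nodes)]
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            d = circle_distance(position[i], position[j], n_nodes)
            if rng.random() < math.exp(-d / (mu * kappa[i] * kappa[j])):
                adjacency[i].append(j)
                adjacency[j].append(i)
    return position, adjacency

def greedy_hops(position, adjacency, source, target, circumference):
    """Number of hops greedy routing needs, or None if it gets stuck."""
    def to_target(node):
        return circle_distance(position[node], position[target], circumference)
    current, hops = source, 0
    while current != target:
        if not adjacency[current]:
            return None
        nxt = min(adjacency[current], key=to_target)
        if to_target(nxt) >= to_target(current):
            return None  # assumed failure rule: no neighbor is closer
        current, hops = nxt, hops + 1
    return hops

def average_gr_length(n_nodes, gamma, n_pairs, seed, kappa_0=2.0):
    """Mean hop count over successful greedy paths, and the success fraction."""
    rng = random.Random(seed)
    position, adjacency = generate_network(n_nodes, gamma, kappa_0, rng)
    lengths = []
    for _ in range(n_pairs):
        source, target = rng.sample(range(n_nodes), 2)
        hops = greedy_hops(position, adjacency, source, target, n_nodes)
        if hops is not None:
            lengths.append(hops)
    mean = sum(lengths) / len(lengths) if lengths else float("nan")
    return mean, len(lengths) / n_pairs

for size in (200, 400, 800):
    print(size, average_gr_length(size, gamma=2.2, n_pairs=200, seed=1))
```

A single realization at these sizes is far too noisy to test the $\ln\ln N$ law. Averaging over many realizations and much larger $N$, as described in the text, is what makes the scaling visible.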

The left plot in Fig. 2 shows the results of these simulations. We then fit the empirical
$\bar{\tau}(N)$ to a function of the form $A_1 + A_2 \ln[\ln N + A_3]$, where the
constants $A_1$, $A_2$, and $A_3$ are free parameters estimated using a least-squares fit
to the data. The right plot in Fig. 2 shows the empirical estimate of the coefficient
$A_2$ compared with the theoretical prediction $A_2 = |\ln(\gamma - 2)|^{-1}$. The
agreement is very good for values of $\gamma$ close to 2 and deteriorates as $\gamma$
approaches 3. This deterioration is a consequence of the mean-field approximation, which
assumes that at each hop the degree increases, which is true only on average. In fact, for
$\gamma$ approaching 3, there is an increasingly non-negligible probability of making a
hop toward a smaller-degree node [...]. We have also checked [13] that the fluctuations of
GR path lengths around their average, $\sqrt{\overline{\tau^2} - \bar{\tau}^2}$, stay
constant with increasing $N$, or even slightly decrease for small $\gamma$. These
observations confirm that in the thermodynamic limit the GR path length distribution
converges to a delta function.
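
The three-parameter fit can be sketched without any external library. For a fixed trial value of $A_3$ the model is linear in $A_1$ and $A_2$, so ordinary linear regression gives those two and $A_3$ is then picked from a grid. The text only states that a least-squares fit was used, so this grid-based variant, the function name `fit_double_log`, and the synthetic data below are assumptions for illustration and do not reproduce the data of Fig. 2.

```python
import math

def fit_double_log(sizes, lengths, a3_grid):
    """Least-squares fit of lengths ~ A1 + A2 * ln(ln N + A3).

    Every a3 in a3_grid must satisfy ln(N) + a3 > 0 for all sizes N.
    Returns (A1, A2, A3, sum of squared residuals) for the best grid value.
    """
    n = len(sizes)
    mean_y = sum(lengths) / n
    best = None
    for a3 in a3_grid:
        xs = [math.log(math.log(size) + a3) for size in sizes]
        mean_x = sum(xs) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, lengths))
        a2 = sxy / sxx              # slope of the linear regression
        a1 = mean_y - a2 * mean_x   # intercept
        sse = sum((a1 + a2 * x - y) ** 2 for x, y in zip(xs, lengths))
        if best is None or sse < best[3]:
            best = (a1, a2, a3, sse)
    return best

# Synthetic, noise-free data generated from the model itself with gamma = 2.5.
gamma = 2.5
true_a2 = 1.0 / abs(math.log(gamma - 2.0))
sizes = [10 ** k for k in range(3, 9)]
lengths = [1.0 + true_a2 * math.log(math.log(size) + 2.0) for size in sizes]
a1, a2, a3, sse = fit_double_log(sizes, lengths, [0.5 * i for i in range(11)])
print(round(a2, 4), round(true_a2, 4), a3)
```

With noisy simulation data the grid over $A_3$ would need to be finer, and the recovered $A_2$ would be compared with $|\ln(\gamma - 2)|^{-1}$ in the same way.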

In summary, we have shown that greedy routing finds asymptotically shortest paths in
scale-free networks with strong clustering and power-law node degree distribution
exponents $\gamma < 3$. Given that topologies of many real networks do have these
properties [17, 18, 19], our findings imply, surprisingly, that even without any global
knowledge of network topology, nodes in complex networks can propagate information along
the shortest routes. In other words, topologies of many real networks have a peculiar
structure that guarantees that the lack of global topological awareness has asymptotically
no impact on the structure of information flows in the network: with or without the global
topology knowledge, information can flow along the shortest routes. There are other,
regular networks, such as lattices, that also possess these properties, but they require
specific embeddings into specific spaces. Greedy routes on scale-free networks, on the
other hand, are shortest regardless of the specifics of a hidden metric space or
connection probability.

Complex networks thus have the structure that allows them to perform, in the most
efficient way, one of their most basic and common functions: to propagate or signal
information to specific targets through a complex network maze whose global connectivity
is unknown to any node. It remains an open question if real networks evolve to become
navigable [20, 21], or which networks do have hidden metric spaces underneath and which do
not. Even if such spaces exist, it may be quite challenging to identify their exact
structure. At the same time, our findings have optimistic practical implications as they
open up a possibility to find shortest-path routing strategies for the Internet that would
not require any global topology knowledge. The requirement for routers to have and
constantly update this knowledge is a major scalability bottleneck in the Internet today
[22].